3D-QSAR-Based Pharmacophore Modeling, Virtual Screening, and Molecular Docking Studies for Identification of Tubulin Inhibitors with Potential Anticancer Activity

In this study, we aimed to develop a pharmacophore-based three-dimensional quantitative structure activity relationship (3D-QSAR) for a set including sixty-two cytotoxic quinolines (1-62) as anticancer agents with tubulin inhibitory activity. A total of 279 pharmacophore hypotheses were generated based on the survival score to build QSAR models. A six-point pharmacophore model (AAARRR.1061) was identified as the best model which consisted of three hydrogen bond acceptors (A) and three aromatic ring (R) features. The model showed a high correlation coefficient (R2 = 0.865), cross-validation coefficient (Q2 = 0.718), and F value (72.3). The best pharmacophore model was then validated by the Y-Randomization test and ROC-AUC analysis. The generated 3D contour maps were used to reveal the structure activity relationship of the compounds. The IBScreen database was screened against AAARRR.1061, and after calculating ADMET properties, 10 compounds were selected for further docking study. Molecular docking analysis showed that compound STOCK2S-23597 with the highest docking score (-10.948 kcal/mol) had hydrophobic interactions and can form four hydrogen bonds with active site residues.


Introduction
Cancer is a disease identified by the uncontrolled proliferation of cells. As cancer is the second deadliest disease worldwide, discovering new methods and drugs to treat this disease is very important [1,2]. Nowadays, scientists are looking to find many novel targets for developing anticancer drugs, due to the critical need for new anticancer agents. Microtubules are polymers composed of αand β-tubulin heterodimers [3]. They control several cellular functions including motility regulation, cell signaling, secretion, and cell architecture in interphase [4,5]. Thereby, tubulin and microtubules are important targets for antitumor therapy [6]. The αand β-tubulin het-erodimers are in dynamic equilibrium with microtubules. The impairment of the dynamic equilibrium of microtubules leads to mitotic arrest and accordingly apoptosis [7,8]. Therefore, microtubule targeting agents that interfere with dynamic microtubules can be effective in the treatment of cancer. In general, the binding sites for paclitaxel, vinblastine, and colchicine are well determined in tubulin [9][10][11].
Agents that bind to the vinca alkaloid binding site (e.g., Vincristine) or the colchicine binding site (e.g., colchicine and podophyllotoxin) are determined as microtubule destabilizing agents or inhibitors of tubulin assembly. In contrast, agents that bind to the paclitaxel binding site (e.g., paclitaxel) are tubulin promotors or microtubule stabilizing agents [12,13].
Antimitotic agents such as vinca alkaloids and taxanes have been used for the clinical treatment of different cancerous patients over the past decades [14,15]. However, the use of these agents is limited due to drug resistance and associated side effects [16,17]. Recently, different small molecules have been designed as tubulin inhibitors and anticancer agents [5,8,12,18,19].
Despite the design and synthesis of promising compounds, antiproliferative and tubulin inhibitory activities of these compounds could not be confirmed until experimentally evaluated, which are time-consuming and expensive besides ethical limitations in using and sacrificing animals. Therefore, computational methods (in silico tools) such as pharmacophore modeling, drug screening, and design are essential for activity prediction and greatly reduce the time and cost of drug development [20,21].
One advantage of in silico methodologies is that they can be used to identify new compounds with desirable properties as "druggable" targets before they are synthesized and thus reduce the need for time-consuming and expensive animal and in vitro laboratory work [21,22].
The relationships between the biological activity and physicochemical properties of a set of compounds could be analyzed with three-dimensional quantitative structure activity relationships (3D-QSARs). QSAR techniques by generating three-dimensional alignment of molecules facilitate design and synthesis of new compounds with superior activ-ity [23]. In this paper, quinolines with cytotoxic activities were used to develop new potent anticancer agents. We selected some cytotoxic quinolines (1-62) as tubulin inhibitors from our previous works [5,8], generated a training set and test set, and created 279 pharmacophore models with

Data Set and Ligand Preparation.
For the preparation of common pharmacophore through 3D-QSAR studies, a set including sixty-two quinolines from our previous studies, with cytotoxic activity against A2780 (human ovarian carcinoma) cell line, was selected and the pIC 50 (pIC 50 = − logIC 50 ) values were calculated [24]. The data set was randomly divided into training and test sets for generation and validation of the model, respectively [25]. The 3D structures of ligands were generated using the builder panel in Maestro and successively optimized using LigPrep module (v4.3, Schrodinger 2016-1) [26]. The energy minimization was done using OPLS_2005 (optimized potentials for liquid simulations) with an implicit distancedependent dielectric solvation treatment [27].
2.2. Pharmacophore 3D-QSAR Modeling. For the generation of pharmacophore and 3D-QSAR models for anticancer agents, Phase (v4.3, Schrodinger 2016-1 was used [28]. The data set ligands were categorized into active, with the threshold of pIC 50 > 5:5, and inactive, with the threshold of pIC 50 < 4:7 for the generation of common pharmacophore hypotheses [29,30]. The default settings were used to generate acceptable conformations, and a maximum of 100 conformers were generated. Alignment was done, and a maximum of one conformer was retained for every ligand. Random selection was used for assigning training and test set for 62 compounds. Twelve (12) compounds were selected as a test set, and the remaining (50 compounds) were used as a training set. Pharmacophore sites were produced based on the pharmacophore features in the Phase module. There are six built-in pharmacological features in Phase, namely, hydrogen bond receptor (A), hydrogen bond donor (D), hydrophobic group (H), negatively charged group (N), positively charged group (P), and aromatic ring (R) [31]. In the generated hypotheses, six common sites were found for all selected compounds. Six of the best resulting hypotheses were scored and ranked by their vector, volume, site scores, survival scores, and survival actives showed in Table 1 [28]. AAARRR.1061 hypothesis which consisted of three hydrogen bond acceptors (A) and three aromatic ring (R) features was selected as the best model for further study (Figure 1).           The distance and angles between different sites of the model AAARRR.1061 are shown in Table 2 and Table S1 (supplementary data), respectively. The alignment of active and inactive compounds with the generated common pharmacophore was shown in Figures 2(a) and 2(b). PLS statistical parameters of the model AAARRR.1061 were shown in Table 3. Structure of compounds, experimental and predicted inhibitory activities (pIC 50 values), residual values, and fitness score of all the ligands were reported in Table 4.

Model
Validation. The last step of developing the QSAR model was model validation [32,33]. The developed pharmacophore hypothesis was validated using potent approaches like the Y-Randomization test and ROC-AUC analysis [34]. The statistical parameters, including the squared correlation coefficient (R 2 ), cross-validation (leave one out) Q 2 , the standard deviation of regression (SD), Pearson's correlation coefficient (Pearson's R), statistical significance (P), root mean square error (RMSE), and variance ratio (F), were shown in Table 3 [35].

ROC-AUC Analysis.
To evaluate our hypothesis, receiver operating characteristic (ROC) curve analysis was also performed using the MedCalc statistical software (http://www.medcalc.org). In ROC analysis, the ability of the obtained pharmacophore model was indicated with the area under the curve (AUC) to distinguish a list of compounds as active or inactive compounds in terms of two parameters, sensitivity, and specificity [24].

Drug-Likeness Filtration and Virtual Screening. The
IBScreen database containing 211432 compounds was selected. The drug-likeness behavior of the database compounds was predicted by using QikProp version 4.3 (Schrödinger). The compounds were employed for the calculation of pharmacokinetic parameters by QikProp v4.3. Physiochemical descriptors and pharmaceutically relevant properties of compounds were evaluated to analyze druggable properties [27,38]. Lipinski's rule of five was used to filter the compounds with drug-like properties [39]. To identify the best match molecules, the AAARRR.1061 hypothesis was applied for screening the IBScreen database with drug-like properties. Finally, we selected the compounds with the pIC 50 value of more than 4 which were considered as the most active compounds, and then, further screening of these compounds is done by ADMET (absorption, distribution, metabolism, elimination, and toxicity) properties using QikProp version 4.3. The compounds complied with Lipinski's rule, with good predicted activity and good ADMET properties, were selected for molecular docking studies [24,32].

Molecular
Docking. The docking study was performed using Glide module in Schrodinger suite 2016-1. The crystal structure of tubulin in complex with colchicine (PDB Code: 4O2B) was obtained from protein data bank Brookhaven (Protein Data Bank (PDB) at Brookhaven National Laboratory). The protein was prepared using the protein preparation wizard in Schrodinger [40]. All the water molecules were deleted, hydrogen atoms were added, and energy minimization was performed using the OPLS_2005 force field [26]. The active site was defined with a radius of 15 Å around the ligand present in the crystal structure. The grid box was generated at a centroid of the active site. The compounds were docked into the catalytic domain of tubulin protein (PDB Code: 4O2B) using Grid-based Ligand Docking [41]. The best docked structures were identified using Dock score and Glide energy ( Figure S1: supplementary data). The compound STOCK2S-23597 with the lowest docking energy was selected for further studies (Figure 3).

Results and Discussion
3.1. Pharmacophore and 3D-QSAR Models. For the development of the pharmacophore model, the Phase (v4.3) module of Schrodinger 2016-1 was used [28]. All the 62 compounds were randomly divided into a training set (50 compounds), to identify the common pharmacophore hypothesis, and a test set (12 compounds). A maximum of six sites was chosen to find the optimum combination of features to the most active compounds in the data set. In total, 279 common pharmacophore hypotheses were generated using different combinations of variants, such as AAARRR, AAAHRR, AAHRRR, AHHRRR, AHHHRR, and AAHHRR with survival scores ranging between 3.48 and 3.87. Six of the best hypotheses were shown in Table 1.
The best-fitted model AAARRR.1061 consists of three hydrogen bond acceptors (A) and three aromatic ring (R) features, and regression scores of this (AAARRR.1061) pharmacophore hypothesis were further analyzed by PLS in the Phase module using 6 PLS factors (Table 3).
In the regression model, R 2 was used to describe the fitness of data, the correlation coefficient (Q 2 ) was used to check the external predictability, and the significance of the model was measured by the Fisher ratio (F) [31,36,37]. Thus, the best QSAR model was chosen based on maximum survival score, good statistical value, good predictive power, and lowest relative conformational energy. Scatter plots for experimental and predicted activities of ligands showed a significant linear correlation of training and test set compounds (Figures 4(a) and 4(b)).
The reliability and predictability of the common pharmacophore model, AAARRR.1061, based on active compounds were determined. The PLS of four was selected as the best model ( Table 3). The relevance of the model was displayed by the regression coefficient (R 2 = 0:865) of the training and cross-validation coefficient (Q 2 = 0:718). The stability of the generated models ranged from 0.454 to 0.941. The

BioMed Research International
These results demonstrated that not only the pharmacophore model AAARRR.1061 could efficiently estimate the cytotoxic activity of the training set, but it also was fitted for the test set. Hence, AAARRR.1061 was validated to be a reliable pharmacophore mode to recognize cytotoxic agents with the ability to inhibit the tubulin polymerization, and then, it will be used to screen the database.

Y-Randomization
Test. Validation of the model was performed by applying Y-Randomization. We built ten random and repeated all procedures to develop a model (Table S2). All the R 2 and Q 2 of the generated models from random were lower (less than 0.26) compared to the original model. This indicated that our model was not generated by chance.

ROC-AUC Analysis.
Additional validation of the common pharmacophore model, AAARRR.1061, was performed using the AUC of the ROC curve. The ROC curve obtained for the validation showed an excellent AUC value of 0.916 ( Figure 5), indicating that the model differentiated the active compounds from the inactive ones efficiently (P < 0:001). The sensitivity, specificity, and accuracy of the model were 74.67, 86.09, and 80.03%, respectively.

3D-QSAR Contour Map Analysis.
To analyze 3D-QSAR results, the model was superimposed on the most active ligand (compound 22) and the least active ligand (compound 62). The generated contour plots (Figures 6(a)-6(f)) showed the correlation of structural properties of compounds including electron-withdrawing, hydrophobic, and H-bond donor properties concerning their activity. Blue cubes indicated favorable regions while red cubes indicated unfavorable regions for biological activity [42,43].
The hydrogen-bond donor nature was compared for the most active compound 22 (Figure 6(a)) and the least active compound 62 (Figure 6(b)). In Figure 6(a), blue cubes were observed at regions lied over the amine group present at position 4 of the quinoline ring. On the other hand, in the least active compound 62 without an amino group at the same steric position (Figure 6(b)), no blue cube was observed in the same region. Therefore, the presence of N-aryl with hydrogen donor amine group was vital for the cytotoxicity and tubulin inhibitory activity. This assumption was further supported by the low activity of compounds 65-71, which do not have N-aryl at position 4 of the quinoline ring.
Figures 6(c) and 6(d) showed the favorable and unfavorable hydrophobic features for the most active compound and least active compound. Figure 6(c) revealed that the blue cubes were generated around the hydrophobic arylstyryl at position 2 and N-aryl at position 4 of the quinoline core were essential for anticancer activity.
In Figure 6(d), red cubes were generated at position 4 of the quinoline core of the least active compound. In this compound, a chloro substituent was present at this region instead of the hydrophobic N-aryl group. Thus, the results revealed that red colored unfavorable regions at these positions could be responsible for the decrease of activity. This was also confirmed by less activity of compounds 65-71 possessing chloro group at position 4 of the quinoline ring.
In Figure 6(e), blue cubes were observed at the para position of N-aryl indicating the preference of electronwithdrawing groups at this position (the presence of an electronegative atom, such as oxygen or nitrogen, was desirable because of the inductive electron-withdrawing effect). Also, blue cubes were observed at the para position of the styryl 3.4. Virtual Screening. First, the IBScreen database containing 211432 compounds was selected. Lipinski's rule of five was used to scrutinize these compounds. The applied filter gave a total of 183084 compounds. The validated 3D-QSAR pharmacophore model AAARRR.1061 was used as a 3D structural query for retrieving potent compounds from the IBScreen database [24]. A thousand hits were obtained in this step for a match to hypothesis AAARRR.1061. Compounds with a pIC 50 value of more than 4 were selected as the active compounds, and hence, 34 compounds were obtained which were further subjected to filtering by applying ADMET properties. The result showed that 10 hits were retrieved with good ADMET properties. The descriptors used in ADMET prediction and criteria for hit selection were shown in Table S3.
Structures of the top 10 best-fit molecules were shown in figure S1 (supplementary data), and molecular properties and ADMET properties were depicted in Table S3 and S4, respectively. 17 BioMed Research International 3.5. Molecular Docking. Above 10 hits obtained from the virtual screening process were subjected to molecular docking studies. A tubulin complex with colchicine was chosen as the target protein for molecular docking using the Glide in Schrodinger 2016. Molecular docking studies were performed to predict the binding conformation of the compounds. The top 10 compounds retrieved by ADMET properties were docked into the binding site of tubulin. Binding interactions and binding energies of these 10 molecules with tubulin were shown in Table S5. Dock scores for the screened compounds ranged between -10.948 (STOCK2S-23597) and -5.991 (STOCK2S-05500).
The binding mode of the most active compound (22) is exhibited in the figure S2. In this compound nitrogen atom of the aniline ring interacted with Thrα 179, the methoxy group, interacted with Lysβ 254, and the nitro group formed a hydrogen bond with Argα 221. These results are consistent with the finding of 3D-QSAR study which showed nitrogen atom of the aniline ring, and the nitro group has key roles in the activity of compound 22.

Conclusion
In this study, 279 pharmacophore models were generated based on a series of quinolines as anticancer agents and tubulin inhibitors. A six-point pharmacophore model (AAARRR.1061) was identified as the best model which consisted of three hydrogen bond acceptors (A) and three aromatic ring (R) features and was then validated by Y-Randomization test and ROC-AUC analysis. The model showed a high correlation coefficient (R 2 = 0:865), crossvalidation coefficient (Q 2 = 0:718), F value (72.3), and a P value of 5.278e-019 at 6 component PLS level. The contour maps obtained for the model with the most active and the least active compounds confirmed that the presence of Naryl with hydrogen donor amine group at position 4 of the quinoline ring, the arylstyryl hydrophobic group at position 2, and N-aryl at position 4 of the quinoline core and the presence electron-withdrawing group at the para position of arylstyryl group were vital for the cytotoxicity and tubulin inhibitory activity.
AAARRR.1061 was used as a 3D query to screen the IBScreen database, and we obtained 1000 compounds. Compounds with a pIC 50 value of more than 4 (34 compounds) were selected as the most active compounds. After applying ADMET properties, 10 compounds were selected for further docking studies. Ultimately, compound STOCK2S-23597 with the highest docking score (-10.948 kcal/mol) was selected as a potent tubulin inhibitor. It formed four hydrogen bonds and hydrophobic interactions with tubulin active site residues.
The obtained results suggested that the proposed 3D-QSAR model and hits obtained on virtual screening of the database have provided new chemical starting points for design and development of novel tubulin targeting agents.

Data Availability
Data are available upon a reasonable request from the corresponding authors.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Acknowledgments
We thank Research Deputy of Mashhad University of Medical Sciences for financial support of this study (as part of thesis of Salimeh Mirzaei). Figure S1: structures of top 10 best-fit molecules from the IBScreen database. Figure S2: 2D-ligand interaction diagram of compound 22 in the catalytic pocket of 4O2B. Table S1: intersite angles between the pharmacophoric sites of AAARRR.1061. Table S2: R 2 and Q 2 values after several Y-Randomisation test. Table S3: the molecular property descriptors used in ADMET prediction. Table S4: the descriptors used in ADMET prediction.